
Divide and Couple: Using Monte Carlo Variational Objectives for Posterior Approximation, by Justin Domke and Daniel Sheldon, College of Information and Computer Sciences, University of Massachusetts Amherst

Neural Information Processing Systems

Recent work in variational inference (VI) uses ideas from Monte Carlo estimation to tighten the lower bounds on the log-likelihood that are used as objectives. However, there is no systematic understanding of how optimizing different objectives relates to approximating the posterior distribution. Developing such a connection is important if the ideas are to be applied to inference--i.e., applications that require an approximate posterior and not just an approximation of the log-likelihood. Given a VI objective defined by a Monte Carlo estimator of the likelihood, we use a "divide and couple" procedure to identify augmented proposal and target distributions. The divergence between these is equal to the gap between the VI objective and the log-likelihood. Thus, after maximizing the VI objective, the augmented variational distribution may be used to approximate the posterior distribution.
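To make the setup concrete, below is a minimal numerical sketch (illustrative only, not the paper's code) of one such Monte Carlo variational objective: the importance-weighted bound, in which the likelihood estimator R is an average of importance weights and, because R is unbiased for p(x), Jensen's inequality gives E[log R] <= log p(x). The toy Gaussian model, the proposal parameters mu and sigma, and the sample counts are assumptions chosen so the exact log-likelihood is available for comparison.

import numpy as np

# Toy model: z ~ N(0, 1), x | z ~ N(z, 1), so the exact marginal is p(x) = N(x; 0, 2).
# Proposal q(z) = N(mu, sigma^2) is a (possibly mismatched) variational distribution.
# R = (1/M) * sum_m p(x, z_m) / q(z_m) is an unbiased Monte Carlo estimator of p(x),
# and E[log R] is a lower bound on log p(x).

def log_joint(x, z):
    return -0.5 * (z ** 2 + (x - z) ** 2) - np.log(2.0 * np.pi)

def log_q(z, mu, sigma):
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

def iw_objective(x, mu, sigma, M, n_outer=2000, seed=0):
    """Monte Carlo estimate of E[log R] for the importance-weighted estimator R."""
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, sigma, size=(n_outer, M))
    log_w = log_joint(x, z) - log_q(z, mu, sigma)            # log importance weights
    log_R = np.logaddexp.reduce(log_w, axis=1) - np.log(M)   # log of the averaged weights
    return log_R.mean()

x = 1.5
exact = -0.25 * x ** 2 - 0.5 * np.log(4.0 * np.pi)           # exact log p(x) = log N(x; 0, 2)
for M in (1, 5, 50):
    print(M, round(iw_objective(x, mu=0.0, sigma=1.0, M=M), 4), "<=", round(exact, 4))

Increasing M tightens the bound toward log p(x), which is the sense in which these Monte Carlo objectives "tighten" the usual ELBO.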


Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Neural Information Processing Systems

What latent features are encoded in language model (LM) representations? Recent work on training sparse autoencoders (SAEs) to disentangle interpretable features in LM representations has shown significant promise. However, evaluating the quality of these SAEs is difficult because we lack a ground-truth collection of interpretable features that we expect good SAEs to recover. We thus propose to measure progress in interpretable dictionary learning by working in the setting of LMs trained on chess and Othello transcripts. These settings carry natural collections of interpretable features--for example, "there is a knight on F3"--which we leverage into supervised metrics for SAE quality. To guide progress in interpretable dictionary learning, we introduce a new SAE training technique, p-annealing, which improves performance on prior unsupervised metrics as well as our new metrics.
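For orientation, here is a minimal PyTorch sketch of a sparse autoencoder trained with an L_p sparsity penalty whose exponent p is annealed from 1 toward a smaller value over training. This is one plausible reading of "p-annealing" and is not necessarily the paper's exact formulation; the dimensions d_model and d_dict, the annealing schedule, and the sparsity coefficient are illustrative assumptions.

import torch

class SparseAutoencoder(torch.nn.Module):
    def __init__(self, d_model, d_dict):
        super().__init__()
        self.encoder = torch.nn.Linear(d_model, d_dict)
        self.decoder = torch.nn.Linear(d_dict, d_model)

    def forward(self, acts):
        feats = torch.relu(self.encoder(acts))    # nonnegative feature activations
        recon = self.decoder(feats)
        return recon, feats

def sae_loss(acts, recon, feats, p, sparsity_coef=1e-3):
    recon_err = (recon - acts).pow(2).sum(dim=-1).mean()
    # L_p penalty on feature activations; p < 1 pushes harder toward exact zeros.
    sparsity = (feats.abs() + 1e-8).pow(p).sum(dim=-1).mean()
    return recon_err + sparsity_coef * sparsity

sae = SparseAutoencoder(d_model=512, d_dict=4096)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
n_steps = 10_000
for step in range(n_steps):
    acts = torch.randn(64, 512)                   # stand-in for LM residual-stream activations
    p = 1.0 - 0.5 * step / n_steps                # anneal p from 1.0 down to 0.5
    recon, feats = sae(acts)
    loss = sae_loss(acts, recon, feats, p)
    opt.zero_grad()
    loss.backward()
    opt.step()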


Importance Weighting and Variational Inference, by Justin Domke and Daniel Sheldon, College of Information and Computer Sciences, University of Massachusetts Amherst

Neural Information Processing Systems

Recent work used importance sampling ideas for better variational bounds on likelihoods. We clarify the applicability of these ideas to pure probabilistic inference by showing that the resulting Importance Weighted Variational Inference (IWVI) technique is an instance of augmented variational inference, thus identifying the looseness in previous work. Experiments confirm IWVI's practicality for probabilistic inference. As a second contribution, we investigate inference with elliptical distributions, which improves accuracy in low dimensions and convergence in high dimensions.
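One common recipe for turning an importance-weighted variational fit into posterior samples is sampling-importance-resampling: draw several proposals from the fitted q, weight each by p(x, z)/q(z), and resample in proportion to the normalized weights. The sketch below is illustrative of that general recipe, not necessarily the paper's exact augmented-variational construction; the toy target, the proposal, and the sample counts are assumptions.

import numpy as np

def posterior_sample(log_joint, log_q, sample_q, M, rng):
    """Draw one approximate posterior sample by sampling-importance-resampling (SIR)."""
    z = sample_q(M, rng)                          # M proposals from the fitted q
    log_w = log_joint(z) - log_q(z)               # unnormalized log importance weights
    w = np.exp(log_w - log_w.max())
    idx = rng.choice(M, p=w / w.sum())            # resample an index by normalized weight
    return z[idx]

# Tiny usage: target joint p(x=1.5, z) with z ~ N(0,1), x|z ~ N(z,1); proposal q = N(0,1).
rng = np.random.default_rng(0)
lj = lambda z: -0.5 * (z ** 2 + (1.5 - z) ** 2)
lq = lambda z: -0.5 * z ** 2
draws = [posterior_sample(lj, lq, lambda M, r: r.normal(0.0, 1.0, M), 50, rng) for _ in range(1000)]
print(np.mean(draws))    # should be near the exact posterior mean, 0.75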


Thinking Forward: Memory-Efficient Federated Finetuning of Language Models

Neural Information Processing Systems

Finetuning large language models (LLMs) in federated learning (FL) settings has become increasingly important as it allows resource-constrained devices to finetune a model using private data. However, finetuning LLMs using backpropagation requires excessive memory (especially from intermediate activations) for resource-constrained devices. While Forward-mode Auto-Differentiation (AD) can significantly reduce memory footprint from activations, we observe that directly applying it to LLM finetuning results in slow convergence and poor accuracy.
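To illustrate why forward-mode AD saves activation memory, here is a minimal PyTorch sketch of a single "forward gradient" step: the gradient is estimated from a Jacobian-vector product against a random tangent direction, computed in one forward pass, with no backpropagation graph stored. This is a generic forward-gradient estimator under assumed settings (a tiny linear model, random data, a fixed learning rate), not the paper's federated finetuning algorithm.

import torch
from torch.func import functional_call, jvp

model = torch.nn.Linear(16, 2)        # stand-in for an LLM (or a small trainable adapter)
params = dict(model.named_parameters())
x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))

def loss_fn(p):
    logits = functional_call(model, p, (x,))
    return torch.nn.functional.cross_entropy(logits, y)

# Random tangent direction v, one entry per parameter tensor.
v = {k: torch.randn_like(t) for k, t in params.items()}

# Forward-mode AD returns the loss and the directional derivative <grad, v> in one pass.
loss, dir_deriv = jvp(loss_fn, (params,), (v,))

# Forward-gradient estimate: g = <grad, v> * v, used directly for the update.
lr = 1e-3
with torch.no_grad():
    for k, t in params.items():
        t -= lr * dir_deriv * v[k]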



Shlomo Zilberstein wins the 2025 ACM/SIGAI Autonomous Agents Research Award

AIHub

This prestigious award is made for excellence in research in the area of autonomous agents. It is intended to recognize researchers in autonomous agents whose current work is an important influence on the field. Professor Shlomo Zilberstein was recognised for his work establishing the field of decentralized Markov Decision Processes (DEC-MDPs), laying the groundwork for decision-theoretic planning in multi-agent systems and multi-agent reinforcement learning (MARL). The selection committee noted that these contributions have become a cornerstone of multi-agent decision-making, influencing researchers and practitioners alike. Shlomo Zilberstein is Professor of Computer Science and former Associate Dean of Research at the University of Massachusetts Amherst. He is a Fellow of AAAI and the ACM, and has received numerous awards, including the UMass Chancellor's Medal, the IFAAMAS Influential Paper Award, and the AAAI Distinguished Service Award.


Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs

arXiv.org Artificial Intelligence

The difficulty of multiple-choice questions (MCQs) is a crucial factor for educational assessments. Predicting MCQ difficulty is challenging since it requires understanding both the complexity of reaching the correct option and the plausibility of distractors, i.e., incorrect options. In this paper, we propose a novel, two-stage method to predict the difficulty of MCQs. First, to better estimate the complexity of each MCQ, we use large language models (LLMs) to augment the reasoning steps required to reach each option. We use not just the MCQ itself but also these reasoning steps as input to predict the difficulty. Second, to capture the plausibility of distractors, we sample knowledge levels from a distribution to account for variation among students responding to the MCQ. This setup, inspired by item response theory (IRT), enables us to estimate the likelihood of students selecting each option (both correct and incorrect). We align these predictions with their ground-truth values using a Kullback-Leibler (KL) divergence-based regularization objective, and use the estimated likelihoods to predict MCQ difficulty. We evaluate our method on two real-world math MCQ and response datasets with ground-truth difficulty values estimated using IRT. Experimental results show that our method outperforms all baselines, with up to a 28.3% reduction in mean squared error and a 34.6% improvement in the coefficient of determination. We also qualitatively discuss how our novel method results in higher accuracy in predicting MCQ difficulty.
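A toy sketch of the IRT-flavored second stage is given below: sample student knowledge levels, convert per-option scores into selection probabilities with a softmax, average over students to get an expected option-selection distribution, and compare it to observed frequencies with a KL term. The function names, the parameterization, and the example numbers are illustrative assumptions, not the paper's model.

import numpy as np

rng = np.random.default_rng(0)

def predicted_option_probs(option_scores, discrimination=1.0, n_students=1000):
    """option_scores: higher = more attractive to a knowledgeable student (correct option first)."""
    theta = rng.standard_normal(n_students)                # sampled knowledge levels
    # A more knowledgeable student weighs high-scoring options more heavily.
    logits = discrimination * np.outer(theta, option_scores)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.mean(axis=0)                              # expected selection distribution

def kl(p_true, p_pred, eps=1e-9):
    return float(np.sum(p_true * np.log((p_true + eps) / (p_pred + eps))))

observed = np.array([0.55, 0.25, 0.15, 0.05])              # observed selection frequencies
predicted = predicted_option_probs(np.array([1.0, 0.3, 0.1, -0.2]))
print(predicted, "KL to observed:", kl(observed, predicted))
# The predicted probability of the correct option (index 0) can then feed a difficulty regressor.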


The study of short texts in digital politics: Document aggregation for topic modeling

arXiv.org Artificial Intelligence

Statistical topic modeling is widely used in political science to study text. Researchers examine documents of varying lengths, from tweets to speeches. There is ongoing debate on how document length affects the interpretability of topic models. We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. In our study, we analyze one million tweets by U.S. state legislators from April 2016 to September 2020. We find that for documents aggregated at the account level, topics are more associated with individual states than when using individual tweets. This finding is replicated with Wikipedia pages aggregated by birth cities, showing how document definitions can impact topic modeling results.
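The aggregation step itself is simple; the sketch below shows one way to concatenate each account's tweets into a single document before fitting a standard topic model. The field names, the example tweets, and the choice of scikit-learn's LDA are illustrative assumptions, not the study's pipeline.

from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    {"account": "legislatorA", "text": "town hall on school funding tonight"},
    {"account": "legislatorA", "text": "proud to vote for the education budget"},
    {"account": "legislatorB", "text": "touring flood damage with emergency crews"},
]

docs_by_account = defaultdict(list)
for t in tweets:
    docs_by_account[t["account"]].append(t["text"])
documents = [" ".join(texts) for texts in docs_by_account.values()]   # one document per account

counts = CountVectorizer(stop_words="english").fit_transform(documents)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)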


AI scholars win Turing Prize for technique that made possible AlphaGo's triumph

ZDNet

Some of the flashiest achievements in artificial intelligence in the past decade have come from a technique by which the computer acts randomly from a set of choices and is rewarded or punished for each correct or wrong move. It's the technique most famously employed in AlphaZero, the Google DeepMind program that achieved mastery of the games of chess, shogi, and Go in 2018. The same approach helped the AlphaStar program achieve "grandmaster" play in the video game StarCraft II. On Wednesday, two AI scholars were rewarded for advancing so-called reinforcement learning, a very broad approach to how a computer proceeds in an unknown environment. Andrew G. Barto, professor emeritus in the College of Information and Computer Sciences at the University of Massachusetts Amherst, and Richard S. Sutton, professor of computer science at the University of Alberta, Canada, were jointly awarded the 2024 Turing Award by the Association for Computing Machinery.


Andrew Barto and Richard Sutton win Turing award for AI training trick

New Scientist

Andrew Barto and Richard Sutton have won the 2024 Turing award, which is often called the Nobel prize of computing, for their fundamental work on ideas in machine learning that later proved crucial to the success of artificial intelligence models such as Google DeepMind's AlphaGo. Barto, who is now retired and lives in Cape Cod, Massachusetts, didn't even realise he was nominated for the award. "I joined a Zoom with some people and was told and I was…